Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding BoshTask #231

Merged
merged 20 commits into from
May 4, 2020
Merged

Adding BoshTask #231

merged 20 commits into from
May 4, 2020

Conversation

djarecka
Copy link
Collaborator

@djarecka djarecka commented Apr 9, 2020

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Summary

adding Shell Command Task that uses boutiques descriptor and runs bosh command to execute the task
(see discussion #210)
closes #37

Checklist

  • All tests passing
  • I have added tests to cover my changes
  • I have updated documentation (if necessary)
  • My code follows the code style of this project
    (we are using black: you can pip install pre-commit,
    run pre-commit install in the pydra directory
    and black will be run automatically with each commit)

Acknowledgment

  • I acknowledge that this contribution will be available under the Apache 2 license.

@djarecka
Copy link
Collaborator Author

djarecka commented Apr 9, 2020

@ValHayot @glatard, @satra - I've started creating a draft of BoshTask that uses bosh cli, it's not finished, but I've got stuck with a problem that bosh can't find the input files: I believe that this is related to the way how boutiques creates a docker command. It seems that it assume that the input file is in the working directory, is that right?
I was also testing without pydra and it looks like the same command can work in the home directory, but doesn't work in any directory (even if I provide a full path to the input file).

You can see the output message I get from bosh: https://travis-ci.org/github/djarecka/pydra/jobs/673165870

@glatard
Copy link

glatard commented Apr 10, 2020

Hi @djarecka, it looks like the input files aren't mounted in the tool container. By default, bosh only mounts the current working directory, which might explain why it worked when executed from your home directory. The way to fix this would be to pass a list of mounts, through option -v, for instance: bosh exec launch -v <image_dirname>:<image_dirname> [other params] for each input file.

@djarecka
Copy link
Collaborator Author

@glatard -oh, ok, thanks! somehow I assumed that everything has to be mounted automatically (as in cwl).

@djarecka
Copy link
Collaborator Author

@glatard - could you please tell me what is the easiest way to get information about all the outputs that should be produced for the specific zenodo and and inputs (the invocation file)

@djarecka
Copy link
Collaborator Author

@satra - please check the tests for BoshTask: https://github.com/nipype/pydra/blob/0073e150f649cdf9ee070bd80420f3f4ab88ae03/pydra/engine/tests/test_boutiques.py

as we discussed yesterday I used only zenodo files to create input_spec and output_spec, so at the end I have all names/paths of the output that could be created, but user should be asking only about the output that should be created given set of inputs

@codecov
Copy link

codecov bot commented Apr 22, 2020

Codecov Report

Merging #231 into master will decrease coverage by 2.63%.
The diff coverage is 84.21%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #231      +/-   ##
==========================================
- Coverage   86.70%   84.07%   -2.63%     
==========================================
  Files          17       18       +1     
  Lines        2896     3008     +112     
  Branches      781      808      +27     
==========================================
+ Hits         2511     2529      +18     
- Misses        242      333      +91     
- Partials      143      146       +3     
Flag Coverage Δ
#unittests 84.07% <84.21%> (-2.63%) ⬇️
Impacted Files Coverage Δ
pydra/engine/task.py 79.68% <33.33%> (+0.15%) ⬆️
pydra/engine/boutiques.py 84.76% <84.76%> (ø)
pydra/engine/specs.py 91.69% <100.00%> (+0.15%) ⬆️
pydra/engine/workers.py 38.00% <0.00%> (-47.34%) ⬇️
pydra/__init__.py 86.66% <0.00%> (-6.67%) ⬇️
pydra/engine/submitter.py 87.50% <0.00%> (-3.85%) ⬇️
pydra/engine/helpers.py 87.27% <0.00%> (-1.82%) ⬇️
pydra/engine/helpers_file.py 80.27% <0.00%> (+0.91%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 020181d...65cdd56. Read the comment docs.

@satra
Copy link
Contributor

satra commented Apr 22, 2020

@djarecka - looks reasonable for a start - for the second workflow it would be nice to use the shelltask functionality where stdout is read into a variable so you could actually compare the output of the task.

@djarecka
Copy link
Collaborator Author

@satra - let me know if this is what you wanted me to add

@djarecka
Copy link
Collaborator Author

i'm randomly(?) getting error from json.decoder.JSONDecodeError to some boutiques tests, don't really understand why..

@djarecka
Copy link
Collaborator Author

any idea why sometimes json.load could give an error when reading zenodo files? It complains that it should be ", not ', but I think the file is fine, and the error only happens ~20% cases on travis. Not able to reproduce it locally.

I've added retry bloc...

Copy link
Contributor

@satra satra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments. of particular note is that the interaction between state and task in this class seems very specific. any such interaction should be abstracted away to taskbase.

pydra/engine/boutiques.py Outdated Show resolved Hide resolved
pydra/engine/boutiques.py Outdated Show resolved Hide resolved
if input_spec is None:
input_spec = self._prepare_input_spec()
self.input_spec = input_spec
if output_spec is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems that both for input_spec and output_spec there is no check that these align with the bosh spec if they are provided.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right, haven't even tested it...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decided to introduce input/output_spec_names instead of input/output_spec, so you can provide subset of names that should be used, but type, help_string, etc. is taken from the zenodo spec file. Otherwise I wasn't sure how to deal with conflicts in types, help_strings etc.

pydra/engine/boutiques.py Outdated Show resolved Hide resolved
)
return cmd_list

def _input_file(self, state_ind, ind=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this and the previous function appears to interact with state. that seems odd to me. we should not be expecting people to understand state to write a task. thus far all tasks have been somewhat independent of state, and we should keep it that way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't understand your comment. pydra prepares the files, this is a private function

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i did not understand why this function is needed and how to convey to another developer that if they were to write the boutiques task they would need this function. what is the general role of this function? and how would another developer (say a CWL developer) know that this is needed. this seems to depend on state_ind.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function creates invocation json file with inputs that is needed for bosh exec launch. If task has splitter and provides two files for example for infile input, I will create two invocation files. I'm assuming that user doesn't know that bosh requires this file, but just provides infile, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be the kind of thing that happens inside run_task? i.e. the creation of this file is only required when you have the inputs already set. this is not an input to the task itself but to bosh.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but this is used by command_args property that is used in run_task

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably use more specific name

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see that ShellCommandTask follows a similar pattern. conceptually does command_args need to work for tasks with State. is it used anywhere in that form?

wouldn't it be easier if these didn't have to think about state for command_args and would raise an Exception if called when the Task has state, but would return the appropriate thing when the Task is stateless.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't answer you right now. I didn't see a problem with it, but can rethink this in a different PR

Copy link
Collaborator Author

@djarecka djarecka Apr 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and I actually do think that it's nice to have, it helps with debugging

@@ -360,9 +359,23 @@ def _field_metadata(self, fld, inputs, output_dir):
if "value" in fld.metadata:
return output_dir / fld.metadata["value"]
elif "output_file_template" in fld.metadata:
return output_dir / fld.metadata["output_file_template"].format(
**inputs.__dict__
sfx_tmpl = (output_dir / fld.metadata["output_file_template"]).suffixes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you detect suffixes. for imaging files or other modalities this may need to be quite custom? for example nii.gz, BRIK/HEAD, etc.,.

how will pydra generalize this across domains?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my main problem was that input_1 could be filename, filename.nii or filename.nii.gz and output should be always filename.nii.gz, and the templaete is [input_1].nii.gz...
so I decided to remove all suffixes if template has it's own, happy to hear better suggestions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is that specific to bosh or boutiques that output is always compressed nifti? also does bosh/boutiques enforce nifti as output? i don't think anything in boutiques enforces it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you see example for bet - you will see that "output-files" has "path-template": "[MASK].nii.gz", but maskfile (that has "value-key": "[MASK]"), could be either filename or filename.nii.gz. For both cases output-files is filename.nii.gz

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that may be true of bet, but boutiques can support other tools as well. so there needs to be a general way of deciding what the outputs would be.

this is the part where knowing what outputs should be created given the inputs plays a role, but i don't think that's captured in boutiques yet. so as a first pass, you can simply leave inputs/outputs as separate things.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understood that it's not always a case, but I thought this is a good start. Not sure how do you want me to keep it separately. From the zenodo file I have only a template that uses inputs values

…sk: this allows for providing subset of names that should be used, but type, help_string, etc. is taken from the zenodo spec file
@djarecka djarecka changed the title [WIP] Adding BoshTask Adding BoshTask May 4, 2020
@djarecka djarecka merged commit 9d1e27f into nipype:master May 4, 2020
@djarecka djarecka deleted the enh/bosh branch December 30, 2022 20:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Import/Export interoperability specs - Boutiques
3 participants